1 Statement of the problems studied.

The primary goal of this project was to develop new statistical methods, and to
study their theoretical underpinnings, to meet the challenges of the evolving data
analysis problems that the Army encounters. These statistical methods also have
applications in biology, medicine and related sciences. Our research focused on
three important problems: (1) studying the mathematical details of a new
statistical method that we have developed for analyzing longitudinal and clustered
data, (2) developing bivariate models for gene expression data to identify
differentially expressed genes in microarrays, and (3) studying invariance
properties of test statistics that occur in multivariate analysis of variance.


2 Summary of the most important results.

2.1 Quasi-least squares.

A major accomplishment of this project is the development of a new statistical
method for analyzing longitudinal or repeated measurements data. Such data occur
naturally when repeated observations are taken on individuals, or when data are
collected on clusters or groups of subjects sharing similar characteristics. In a
landmark paper, Liang and Zeger (1986, Biometrika, 73, 13-22) introduced
generalized estimating equations (GEE) for analyzing longitudinal data. The GEE
method has become so popular that the 1986 article of Liang and Zeger was included
in Volume 3 of “Breakthroughs in Statistics.” Despite its popularity, the method
has some significant problems, particularly in the estimation of correlations
between the repeated measurements. In this project we developed an improved method
of estimating the correlations: the quasi-least squares method. Using the
asymptotic relative efficiency criterion, we have shown that the quasi-least
squares estimates are good competitors to the maximum likelihood estimates
obtained under the assumption of normality. Further, using simulations we have
shown that the quasi-least squares estimates are robust and insensitive to
departures from normality. They are undoubtedly better than the moment estimates.
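
As an illustration only, the idea behind quasi-least squares can be sketched for
one simple case. The sketch assumes an AR(1) working correlation and residuals
that are already standardized; both are assumptions made here, not details taken
from this report. Under those assumptions, minimizing the quadratic form
sum_i z_i' R(alpha)^{-1} z_i over alpha has a closed-form solution, and the
transformation rho = 2*alpha/(1 + alpha^2) removes the asymptotic bias of that
first-stage value. The function names qls_ar1 and simulate_ar1 are hypothetical,
and this is not the project's implementation.

```python
import numpy as np


def simulate_ar1(m, n, rho, seed=0):
    """Draw m independent subjects, each with n unit-variance AR(1) measurements."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((m, n))
    z = np.empty((m, n))
    z[:, 0] = e[:, 0]
    for j in range(1, n):
        # Stationary AR(1) recursion with marginal variance 1.
        z[:, j] = rho * z[:, j - 1] + np.sqrt(1.0 - rho**2) * e[:, j]
    return z


def qls_ar1(z):
    """Two-stage estimate of an AR(1) correlation (illustrative sketch).

    z : (m, n) array of standardized residuals, one row per subject.
    Returns (alpha_hat, rho_hat): the stage-one minimizer and the
    bias-corrected stage-two estimate.
    """
    z = np.asarray(z, dtype=float)
    # s: sum over adjacent pairs of z_j^2 + z_{j+1}^2; c: sum of lag-one products.
    s = np.sum(z[:, :-1] ** 2 + z[:, 1:] ** 2)
    c = np.sum(z[:, :-1] * z[:, 1:])
    if c == 0.0:
        return 0.0, 0.0
    # Stage one: the stationary point solves c*alpha^2 - s*alpha + c = 0;
    # take the root that lies in (-1, 1).
    disc = max(s * s - 4.0 * c * c, 0.0)
    alpha = (s - np.sqrt(disc)) / (2.0 * c)
    # Stage two: bias correction for the AR(1) structure (equals 2*c/s).
    rho = 2.0 * alpha / (1.0 + alpha**2)
    return float(alpha), float(rho)


if __name__ == "__main__":
    z = simulate_ar1(m=2000, n=5, rho=0.5)
    alpha_hat, rho_hat = qls_ar1(z)
    print(f"stage one: {alpha_hat:.3f}, stage two: {rho_hat:.3f}")
```

In a full analysis the residuals would depend on estimated regression
parameters, so the correlation step would alternate with a GEE-type update of
those parameters; the sketch isolates the correlation step only.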
2.2 Analysis of microarray data.


In recent years, technology has created a major revolution in biology and
microbiology research. The revolution was made possible by the extensive use of
new, inexpensive, high-throughput technologies such as protein chips, mass
spectrometry and microarrays. In particular, advances in microarray technology are
enabling researchers to quantitatively analyze the expression levels of thousands
of genes simultaneously. During the early years of microarray studies, researchers
relied mainly on traditional methods for analyzing gene expression data,
including, but not limited to, hierarchical clustering and t-tests. These methods,
known to perform well for small data sets, have been only partially successful for
large data sets. In this project we have developed new models to analyze gene
expression data from microarrays. Our models, which account for the correlation
between the measured intensities of the control and cancerous tissues, are useful
for calculating the posterior odds of differential expression and for selecting
highly differentially expressed genes.


2.3 Matrix quadratic forms.

The popular statistical tests in multivariate analysis of variance are based on
Cochran’s theorem, which assumes that the samples are taken independently from
normal populations. Much research has been done on extensions of Cochran’s theorem
for matrix quadratic forms. However, these extensions still assume that the
observations within the samples are independent. In this research project we
derived simple versions of Cochran’s theorem for the case where the observations
within each sample are correlated with covariance matrix S. In particular, we have
derived necessary and sufficient conditions for common matrix quadratic forms to
be independent and Wishart distributed, both when S is the Kronecker product of
two nonnegative definite matrices and when S is an arbitrary nonnegative definite
matrix. We have used the results to characterize the class of nonnegative definite
matrices for which the matrix quadratic forms that occur in multivariate analysis
of variance are independent and Wishart distributed, except for a scale factor.
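
For orientation, the classical independent-observations version of the result
can be stated as follows. This is the textbook form, not the extension derived
in this project, and the symbols Y, A, B, r and \Sigma are notation introduced
here for illustration. In the report's notation it presumably corresponds to
the special case in which S is the Kronecker product of an identity matrix and
\Sigma.

```latex
\[
  Y = (y_1, \dots, y_n)^{\top} \in \mathbb{R}^{n \times p}, \qquad
  y_1, \dots, y_n \ \text{i.i.d.}\ N_p(0, \Sigma), \quad
  \Sigma \ \text{positive definite}.
\]
For symmetric $n \times n$ matrices $A$ and $B$:
\[
  Y^{\top} A Y \sim W_p(r, \Sigma)
  \iff A^2 = A \ \text{and}\ \operatorname{rank}(A) = r,
\]
\[
  Y^{\top} A Y \ \text{and}\ Y^{\top} B Y \ \text{are independent}
  \iff AB = 0.
\]
```

The results summarized above ask what replaces the conditions A^2 = A and
AB = 0 once the rows of Y are allowed to be correlated.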